7 research outputs found

elPrep 4 : a multithreaded framework for sequence analysis

Author: Costanza Pascal
Decap Dries
Fostier Jan
Herzeel Charlotte
Verachtert Wilfried
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2019

We present elPrep 4, a reimplementation from scratch of the elPrep framework for processing sequence alignment map files in the Go programming language. elPrep 4 includes multiple new features allowing us to process all of the preparation steps defined by the GATK Best Practice pipelines for variant calling. This includes new and improved functionality for sorting, (optical) duplicate marking, base quality score recalibration, BED and VCF parsing, and various filtering options. The implementations of these options in elPrep 4 faithfully reproduce the outcomes of their counterparts in GATK 4, SAMtools, and Picard, even though the underlying algorithms are redesigned to take advantage of elPrep's parallel execution framework to vastly improve the runtime and resource use compared to these tools. Our benchmarks show that elPrep executes the preparation steps of the GATK Best Practices up to 13x faster on WES data and up to 7.4x faster on WGS data compared to running the same pipeline with GATK 4, while utilizing fewer compute resources.

Ghent University Academic Bibliography

Directory of Open Access Journals

FigShare

Multithreaded variant calling in elPrep 5

Author: Costanza Pascal
Decap Dries
Fostier Jan
Herzeel Charlotte
Verachtert Wilfried
Wuyts Roel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces BAM and VCF output identical to that of GATK4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the runtime of the variant calling pipeline by a factor of 8-16x on both whole-exome and whole-genome data while using the same hardware resources as GATK4. This makes elPrep 5 a suitable drop-in replacement for GATK4 when faster execution times are needed.

Ghent University Academic Bibliography

Directory of Open Access Journals

Applying Reuse Contracts in a Product Line Approach

Author: Carine Lucas
Kim Mens
Patrick Steyaert
Wilfried Verachtert

This paper raised some questions which we are interested in discussing during the workshop.

CiteSeerX

A software package for efficient patient trajectory analysis applied to analyzing bladder cancer development

Author: Charlotte Herzeel
Ellie D’Hondt
Frank Van der Aa
Murat Akand
Roel Wuyts
Valerie Vandeweerd
Wilfried Verachtert
Wouter Botermans
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/11/2023

Directory of Open Access Journals

Data Science in Healthcare: Benefits, Challenges and Opportunities

The advent of digital medical data has brought an exponential increase in information available for each patient, allowing for novel knowledge generation methods to emerge. Tapping into this data brings clinical research and clinical practice closer together, as data generated in ordinary clinical practice can be used towards rapid-learning healthcare systems, continuously improving and personalizing healthcare. In this context, the recent use of Data Science technologies for healthcare is providing mutual benefits to both patients and medical professionals, improving prevention and treatment for several kinds of diseases. However, the adoption and usage of Data Science solutions for healthcare still require social capacity, knowledge and higher acceptance. The goal of this chapter is to provide an overview of needs, opportunities, recommendations and challenges of using (Big) Data Science technologies in the healthcare sector. This contribution is based on a recent whitepaper (http://www.bdva.eu/sites/default/files/Big%20Data%20Technologies%20in%20Healthcare.pdf) provided by the Big Data Value Association (BDVA) (http://www.bdva.eu/), the private counterpart to the EC to implement the BDV PPP (Big Data Value PPP) programme, which focuses on the challenges and impact that (Big) Data Science may have on the entire healthcare chain.

Crossref

Pure OAI Repository

Archivio istituzionale della ricerca - Università di Cagliari

Online Research Database In Technology